A Treebank of Ugaritic. Annotating Fragmentary Attested Languages
Abstract
The paper outlines a treebank of Ugaritic, an extinct Semitic language. It describes the basic structure of the treebank and the possibility of reusing approaches applied to other Semitic languages. It also discusses the problems of analyzing a language attested only in fragmentary form and the possible use of treebank-based approaches for further reconstruction of text passages.
Similar resources
Using the Stockholm TreeAligner
In this paper we present several use cases for the Stockholm TreeAligner, a software tool originally designed for annotating the alignments in a parallel treebank. The tool has been extended and improved to the point that it can now also serve as a general tool for browsing and searching monolingual and parallel treebanks. Among the use cases presented are: building a parallel treebank, browsin...
Annotating Predicate-Argument Structure for a Parallel Treebank
We report on a recently initiated project which aims at building a multi-layered parallel treebank of English and German. Particular attention is devoted to a dedicated predicate-argument layer which is used for aligning translationally equivalent sentences of the two languages. We describe both our conceptual decisions and aspects of their technical realisation. We discuss some select...
TamilTB: An Effort Towards Building a Dependency Treebank for Tamil
Annotated corpora such as treebanks are important for the development of parsers and language applications, as well as for understanding the language itself. Only very few languages possess these scarce resources. In this paper, we describe our effort in syntactically annotating a small corpus (600 sentences) of the Tamil language. Our annotation is similar to Prague Dependency Treebank (PDT 2.0) and c...
The Tectogrammatics of English: on Some Problematic Issues from the Viewpoint of the Prague Dependency Treebank
The present paper aims to illustrate how the description of underlying structures carried out in annotating Czech texts may be used as a basis for comparison with a more or less parallel description of English. Specific attention is given to several points in which there are differences between the two languages that concern not only their surface or outer form, but (possibly) also their un...
A Study of Word-Classing for MT Reordering
MT systems typically use parsers to help reorder constituents. However, most languages do not have adequate treebank data to learn good parsers, and such training data is extremely time-consuming to annotate. Our earlier work has shown that a reordering model learnt from word-alignments using POS tags as features can improve MT performance (Visweswariah et al., 2011). In this paper, we investiga...
Publication date: 2007